San Francisco · New York · Tokyo · Shanghai · Berlin · London


These are confidential sessions. Please refrain from streaming, blogging, or taking pictures.


Advances in OpenGL ES 3.0 


Filip Iliescu 
Graphics and Media Evangelist 
filiescu@apple.com 






Apple A7 Processor 




Apple A7 Processor OpenGL ES 3.0 Xcode 5 
• Moving from ES2 to ES3

• Deeper dive into ES3

• Tuning using Xcode 5 OpenGL ES Debugger
  - New Shader Profiler for A7 GPU
Overview


• Tile Based Deferred Renderer (TBDR)
• Up to 2x graphics performance
  - Compared to A6 (iPhone 5)
• Fully native OpenGL ES 3.0, 2.0
  - ES1.1 backwards compatibility
2D Texture arrays
Primitive Restart
Sampler Objects
GLSL ES 1.00
Vertex Array Objects
Immutable Texture Storage
Multiple Render Targets
Uniform Buffer Objects
More Texture Units
PVRTC
ETC2/EAC
Vertex Texture Fetch
1 & 2 component textures
Per-Line Shader Performance Metrics
Render to Mipmap level
Half-Float Color Buffers
Fast texture copy
Seamless cubemap filtering
Separate Shader Objects
GLSL ES 3.00
Map Buffer Range
Framebuffer Fetch
Non Power of Two Textures
Vertex Buffer Objects
MSAA Render to Texture
Occlusion Query


OpenGL ES Limits 
OpenGL ES Attributes               Apple A7      SGX 554       ES2 context
Max Texture Image Units            16            8             8
Max Combined Texture Image Units   32            8             8
Max Vertex Texture Image Units     16            8             8
Max Vertex Uniform Vectors         512           128           128
Max Fragment Uniform Vectors       224           64            64
Max Varying Vectors                15            8             8
Max Texture & Renderbuffer Size    4096 x 4096   4096 x 4096   4096 x 4096


Key Differences From A6 


• Performance
  - No penalty for dependent texture reads
  - Higher penalty for framebuffer loads and stores
• Precision
  - All FP shader calculations performed with scalar processor
• Limits
  - Apps with ES2 context get ES2 limits
  - Apps with ES3 context get ES3 limits


Moving from ES2 to ES3 


The Big Picture 


• Core
  - ES2 is compatible with the ES3 API
  - ES2 is a subset of ES3
• Extensions (3 cases)
  - Some have moved into the ES3 core as-is
  - Some move into ES3 core with semantic changes
  - Some extensions in ES2 are still extensions in ES3


Case #1 


ES2 extensions promoted directly to ES3 core 


• These work identically in ES2 and ES3
• Just remove EXT, APPLE, OES API suffixes

  - OES_depth24
  - OES_element_index_uint
  - OES_fbo_render_mipmap
  - OES_rgb8_rgba8
  - OES_texture_half_float_linear
  - OES_vertex_array_object
  - EXT_draw_instanced
  - EXT_instanced_arrays
  - EXT_map_buffer_range
  - EXT_occlusion_query_boolean
  - EXT_texture_storage
  - APPLE_sync
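
The renaming rule for Case #1 can be illustrated with a small helper. This is only a sketch of the naming convention, not part of any GL header: `strip_vendor_suffix` and `kSuffixes` are hypothetical names, and the check is deliberately naive (it looks only at the end of the identifier, so it should be applied only to names known to come from the promoted extensions).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Vendor suffixes named on the slide (hypothetical helper table). */
static const char *const kSuffixes[] = { "APPLE", "EXT", "OES" };

/* Copies `name` into `out` (capacity `cap`) without a trailing vendor suffix.
   Also drops the underscore that precedes the suffix in enum names.
   Returns 1 if a suffix was removed, 0 if the name was copied unchanged,
   and -1 if `out` is too small. */
int strip_vendor_suffix(const char *name, char *out, size_t cap)
{
    assert(name != NULL && out != NULL);
    size_t len = strlen(name);
    int stripped = 0;

    for (size_t i = 0; i < sizeof kSuffixes / sizeof kSuffixes[0]; i++) {
        size_t slen = strlen(kSuffixes[i]);
        if (len > slen && strcmp(name + len - slen, kSuffixes[i]) == 0) {
            len -= slen;
            stripped = 1;
            break;
        }
    }
    /* Enum tokens use "_EXT" style suffixes; remove the underscore too. */
    if (stripped && len > 0 && name[len - 1] == '_')
        len--;

    if (len + 1 > cap)
        return -1;
    memcpy(out, name, len);
    out[len] = '\0';
    return stripped;
}
```

For example, passing "glMapBufferRangeEXT" yields "glMapBufferRange", and "GL_MAP_WRITE_BIT_EXT" yields "GL_MAP_WRITE_BIT".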
Case #1 Examples


• EXT_texture_storage


• EXT_map_buffer_range


ES2: glMapBufferRangeEXT(GL_ARRAY_BUFFER, offset, length, GL_MAP_WRITE_BIT_EXT |
         GL_MAP_FLUSH_EXPLICIT_BIT_EXT | GL_MAP_UNSYNCHRONIZED_BIT_EXT);


ES3: glMapBufferRange(GL_ARRAY_BUFFER, offset, length, GL_MAP_WRITE_BIT |
         GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT);


Case #2 


ES2 extensions promoted with API changes 


  - OES_mapbuffer
  - EXT_discard_framebuffer
  - APPLE_framebuffer_multisample
  - OES_depth_texture
  - OES_packed_depth_stencil
  - OES_texture_float
  - OES_texture_half_float
  - EXT_texture_rg
  - EXT_sRGB


Case #2 Examples 


• OES_mapbuffer
ES2: map = glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);


ES3: map = glMapBufferRange(GL_ARRAY_BUFFER, 0, size, GL_MAP_WRITE_BIT);


• EXT_discard_framebuffer


ES2: glDiscardFramebufferEXT(GL_FRAMEBUFFER, count, attachments);
ES3: glInvalidateFramebuffer(GL_FRAMEBUFFER, count, attachments);


• APPLE_framebuffer_multisample
ES2: glResolveMultisampleFramebufferAPPLE();


ES3: glBlitFramebuffer(0, 0, w, h, 0, 0, w, h, GL_COLOR_BUFFER_BIT, GL_NEAREST);


Case #3 


Some extensions in ES2 are still extensions in ES3 


• Check GL_EXTENSIONS
  - APPLE_copy_texture_levels
  - APPLE_rgb_422
  - APPLE_texture_format_BGRA8888
  - EXT_color_buffer_half_float
  - EXT_debug_label
  - EXT_debug_marker
  - EXT_read_format_bgra
  - EXT_shader_framebuffer_fetch
  - EXT_texture_filter_anisotropic
  - IMG_read_format
  - IMG_texture_compression_pvrtc
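
Checking GL_EXTENSIONS at runtime usually comes down to a whole-token search in the space-separated extension string. The following is a minimal sketch under that assumption; `has_extension` is a hypothetical helper name, and the string passed in is assumed to be what `glGetString(GL_EXTENSIONS)` returns on a current context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in `list`,
   0 otherwise. A plain strstr() is not enough, because one extension name
   can be a prefix of another. */
int has_extension(const char *list, const char *name)
{
    assert(list != NULL && name != NULL);
    size_t nlen = strlen(name);
    if (nlen == 0)
        return 0;

    const char *p = list;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == list) || (p[-1] == ' ');
        int ends_token = (p[nlen] == ' ') || (p[nlen] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += nlen; /* keep searching after this partial match */
    }
    return 0;
}
```

A caller would typically test for a name such as "GL_EXT_shader_framebuffer_fetch" once after context creation and cache the result, rather than searching the string every frame.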
• Fully supported
  - ES2-style shaders are compatible with both ES2 and ES3
  - Version 100 assumed if no #version specified
• #version 300 es
  - Many language additions and changes
  - Similar to desktop GLSL 330
  - Video: “Migrating to OpenGL Core Profile”


Adoption Strategy 


• Now
  - Test your ES2-based games on iPhone 5s
  - Especially, correct any logical buffer loads/stores
• Next
  - Support both ES2 and ES3
  - Try for an ES3 context, fall back to ES2 if not available
  - Handle extension APIs conditionally at runtime
• Some games: Go ES3 only
  - Games requiring ES3 features, deferred shading, etc.
2D Texture arrays
Program Performance View
Instanced Rendering
Primitive Restart
Sampler Objects
GLSL ES 1.00
Vertex Array Objects
Immutable Texture Storage
Multiple Render Targets
Uniform Buffer Objects
3D Textures
Deferred Lighting / Shading
Render to Texture
PVRTC
ETC2/EAC Compressed Textures
1 & 2 component textures
Render to Mipmap level
New Buffer Formats
Per-Line Shader Performance Metrics
Fast texture copy
Seamless cubemap filtering
Separate Shader Objects
GLSL ES 3.00
Map Buffer Range
Framebuffer fetch
Non Power of Two Textures
Extended Indices
Transform feedback
Pixel Buffer Objects
More Texture Units
Occlusion Query
Drawing Many Objects

// Draw asteroids
for (x = 0; x < NUM_ASTEROIDS; x++)
{
    ...
}


Instanced Rendering
Faster way to draw many similar objects


• Draws the same object many times
  - All in a single draw call
• Each can have different parameters
  - Positions
  - Rotations
  - Texture coordinates
  - etc.

Instanced Rendering
Two forms


• Instanced arrays
  - All instance parameters stored in an attribute array
• Shader instance ID
  - Instance parameters derived from gl_InstanceID in vertex shader
  - ES3: In the ES3 core
  - ES2: GL_APPLE_instanced_arrays, GL_APPLE_draw_instanced
Vertex shader


• gl_InstanceID incremented for each instance
  - 0, 1, 2, 3, ...
• You take it from there
  - Use ID to do a calculation in shader
  - Use ID for lookup in Uniform Buffer Object (UBO)
  - Use ID for lookup with Vertex Texture Sampling
// Vertex attributes for one asteroid
glVertexAttribPointer(0, ..., vertices);
glVertexAttribPointer(1, ..., normals);
glVertexAttribPointer(2, ..., colors);


// Uniforms for all
glUniformMatrix4fv(..., modelViewProjectionMatrix);


// All in one draw call
glDrawArraysInstanced(GL_TRIANGLES, 0, NUM_VERTICES, NUM_ASTEROIDS);
#version 300 es
in vec4 inPos;
in vec3 inNorm, inColor;

uniform float spacing;
uniform mat4 cameraMVP;

void main()
{
    vec4 pos = inPos;

    // Lay the instances out on a grid, 100 per row
    ivec2 instancePosition = ivec2(gl_InstanceID % 100, gl_InstanceID / 100);
    pos.xy += vec2(instancePosition) * spacing;

    gl_Position = cameraMVP * pos;
}
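
The placement math in the shader above can be checked on the CPU. The sketch below mirrors it in C; `Offset2D` and `instance_offset` are hypothetical names introduced only for illustration, and the row width of 100 is taken directly from the shader.

```c
#include <assert.h>

/* XY offset applied to one instance (hypothetical helper type). */
typedef struct { float x, y; } Offset2D;

/* CPU-side mirror of the vertex shader's grid layout:
   column = id % 100, row = id / 100, each scaled by `spacing`. */
Offset2D instance_offset(int instance_id, float spacing)
{
    assert(instance_id >= 0);
    Offset2D o;
    o.x = (float)(instance_id % 100) * spacing;
    o.y = (float)(instance_id / 100) * spacing;
    return o;
}
```

With a spacing of 2.0, instance 250 lands in column 50 of row 2, that is at offset (100, 4). A helper like this can be useful for culling or picking code that needs to know where the GPU placed each instance.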


Instanced Rendering
Demo


• Instance ID
  - Used for lookup in UBO
  - Used as seed for spin rate per-asteroid
• Uniform Buffer Object
  - Holds transformation, color data for each instance
  - Limited size
• Transform feedback & rasterizer discard
  - Vertex stage only
  - Used to populate the UBO at startup with model view matrix, etc.
  - Reduces per-vertex calculations to per-instance


Multiple Render Targets 


Multiple Render Targets 


Concept 


• Render to multiple textures or renderbuffers from a single draw call


• Enables deferred lighting/shading, other effects


• Each attachment’s format can differ from each other


• 128 bits per pixel
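
One way to sanity-check a G-buffer layout against the 128 bits per pixel figure is to add up the size of each attachment. This sketch assumes the figure is a combined budget across all color attachments, which the slide does not spell out; `mrt_fits_budget` and `MRT_MAX_BITS_PER_PIXEL` are hypothetical names, and the caller supplies the per-format sizes (for example 32 bits for an RGBA8 attachment).

```c
#include <assert.h>
#include <stddef.h>

/* Figure quoted on the slide; treated here as a combined budget (assumption). */
#define MRT_MAX_BITS_PER_PIXEL 128u

/* Sums the per-pixel size of every color attachment. Stores the sum in
   `total_out` when it is non-NULL and returns 1 if the layout fits the
   budget, 0 otherwise. */
int mrt_fits_budget(const unsigned *bits_per_attachment, size_t count,
                    unsigned *total_out)
{
    assert(bits_per_attachment != NULL || count == 0);
    unsigned total = 0;
    for (size_t i = 0; i < count; i++)
        total += bits_per_attachment[i];
    if (total_out != NULL)
        *total_out = total;
    return total <= MRT_MAX_BITS_PER_PIXEL;
}
```

Four 32-bit attachments add up to exactly 128 bits and pass the check, while a layout with two 64-bit attachments plus one 32-bit attachment (160 bits) does not.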
Using Multiple Render Targets


Multiple Render Targets 
Setup 


// Define 4 color attachments for currently bound framebuffer
GLenum renderbuffers[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1,
                           GL_COLOR_ATTACHMENT2, GL_COLOR_ATTACHMENT3 };


// Attach textures as output buffers
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, normalTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, depthTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, albedoTex, 0);


// Tell GL to enable buffers to draw into
glDrawBuffers(4, renderbuffers);


// Draw
glDrawElements(...);


Multiple Render Targets 


Fragment shader 


#version 300 es

layout(location = 0) out lowp vec4 fs_color_output;
layout(location = 1) out lowp vec4 fs_normal_output;
layout(location = 2) out highp uint fs_depth_output;
layout(location = 3) out lowp vec4 fs_albedo_output;

void main(void)
{
    fs_color_output = ...
    fs_normal_output = ...
    fs_depth_output = ...
    fs_albedo_output = ...
}


Multiple Render Targets 
For Deferred Shading 
Framebuffer Fetch

• Provides destination color in fragment shader
• Syntax
  - Built-in variable in #version 100 shaders
      gl_LastFragData[0]
  - User-declared in #version 300 es
      layout(location = 0) inout lowp vec4 my_destination_name;
• Useful for
  - Programmable blending
  - Local post-processing effects
Framebuffer Fetch
With Multiple Render Targets

  - Read with framebuffer fetch
  - Write with MRT

• Read from one, write to another
Deferred Shading in One Pass

[Diagram labels: Compute; Clean up and Present]

Deferred Shading in One Pass 
Three stages


• Multiple Render Targets
  - Render G-buffer attachments in one pass
  - Formats can vary between attachments
• Framebuffer fetch
  - Render deferred lights in the same pass
  - Read from all attachments, write to one
  - G-buffer becomes per-pixel scratch space
• Framebuffer invalidate
OpenGL ES Tools